Abstract


Consider the linear regression model $y_i = x_i'\beta + e_i$, $i = 1, 2, \ldots$, where $\{x_i\}$ is a sequence of known p-vectors, $\beta' = (\beta_1, \ldots, \beta_p)$ is an unknown p-vector of regression coefficients, and $\{e_i\}$ is a sequence of random errors. It is of interest to test the hypothesis $H_k$: $\beta_{k+1} = \cdots = \beta_p = 0$, $k = 0, 1, \ldots, p$. We do not assume that the random errors are identically distributed with zero means, since this is sometimes unrealistic. To compensate for this relaxation, we assume the errors have a common bounded support $[a_1, a_2]$. Under certain conditions, we obtain a strongly consistent estimate of the number $k$ for which $\beta_k \neq 0$ and $\beta_{k+1} = \cdots = \beta_p = 0$, by using information-theoretic criteria.


1.  Introduction 


Consider the linear model

$$y_i = x_i'\beta + e_i, \quad i = 1, 2, \ldots, n, \tag{1}$$

where the $x_i$'s are experiment points, $\beta = (\beta_1, \ldots, \beta_p)'$ is the regression coefficient vector to be estimated, and the $e_i$'s are random errors. In the usual linear regression model it is assumed that the random errors have vanishing expectations and common variance. In this case, the least squares estimation (LSE) method plays an important role in making statistical inference on the regression coefficient vector $\beta$. There are many papers in the literature concerning the LSE, and many important results have been obtained (for part of this work see [1], [2] and [3]). However, the unbiasedness and consistency (even weak consistency) of the LSE depend strongly on the assumption that the expectations of the errors are zero, and this assumption is sometimes unrealistic. It is of interest to find consistent estimates of the regression coefficients when the expectations of the errors are not equal to each other. In [4] two methods for finding consistent estimates of the regression coefficient vector $\beta$ are proposed.


The first method is to use the measure

$$Q_n(\beta) = \max_{1 \le i \le n}(y_i - x_i'\beta) - \min_{1 \le i \le n}(y_i - x_i'\beta).$$

The estimator $\hat\beta_n$ of $\beta$ is defined as the vector which minimizes $Q_n(\beta)$. The estimate $\hat\beta_n$ is temporarily called the MD estimate of $\beta$ in [4] (the estimate based on the Maximum Difference between residuals).
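To make the criterion concrete, the following is a minimal sketch for the scalar case $p = 1$ only. It is not the procedure of [4]; the function names and the data are hypothetical. It relies on the fact that, for a scalar $\beta$, the maximum-minus-minimum residual criterion is convex and piecewise linear, so a minimizer can be found among the points where two residual lines cross.

```python
from itertools import combinations

def md_objective(beta, xs, ys):
    """Largest residual minus smallest residual for a scalar coefficient."""
    residuals = [y - x * beta for x, y in zip(xs, ys)]
    return max(residuals) - min(residuals)

def md_estimate(xs, ys):
    """Minimize the criterion for p = 1 by checking every kink of the
    convex piecewise-linear objective (crossings of two residual lines)."""
    candidates = [
        (yi - yj) / (xi - xj)
        for (xi, yi), (xj, yj) in combinations(zip(xs, ys), 2)
        if xi != xj
    ]
    if not candidates:  # all x_i equal: the objective does not depend on beta
        return 0.0
    return min(candidates, key=lambda b: md_objective(b, xs, ys))

# Hypothetical data: y_i = 2 * x_i + e_i with errors 0, 1, 0, 1,
# i.e. bounded errors whose mean is not zero.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 5.0, 6.0, 9.0]
beta_hat = md_estimate(xs, ys)
```

On this toy data the criterion is minimized at the true slope even though the errors do not have mean zero. For $p > 1$ a linear-programming solver would be the natural tool; the brute-force search above is only meant for illustration.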


The second method is to use the measure

$$\tilde Q_n(\beta) = \max_{1 \le i \le n} |y_i - x_i'\beta|.$$

Denote by $\tilde\beta_n$ the value of $\beta$ which minimizes $\tilde Q_n(\beta)$. Also, $\tilde\beta_n$ is temporarily called the MA estimate of $\beta$ (the estimate based on the Maximum Absolute values of residuals).
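Both criteria can be minimized by linear programming. The formulation below is a standard reformulation and is not stated in the source; the auxiliary variables $u$, $l$ and $t$ are introduced here only for illustration.

```latex
% MD estimate: minimize the width of a band containing all residuals.
\min_{\beta,\,u,\,l}\; u - l
\quad\text{subject to}\quad
l \le y_i - x_i'\beta \le u, \qquad i = 1, \ldots, n.

% MA estimate: minimize the half-width of a band centred at zero.
\min_{\beta,\,t}\; t
\quad\text{subject to}\quad
-t \le y_i - x_i'\beta \le t, \qquad i = 1, \ldots, n.
```

At an optimum, $u - l$ equals the maximum difference between residuals and $t$ equals the maximum absolute residual, so the two programs return the MD and MA criteria at their minimizers.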


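Let $\hat\beta_{kn}$ and $\tilde\beta_{kn}$ denote the MD and MA estimates of $\beta$ computed under $H_k$, let $C_n$ be a sequence of positive constants, and write

$$R_k = Q_n(\hat\beta_{kn}) + kC_n, \qquad \tilde R_k = \tilde Q_n(\tilde\beta_{kn}) + kC_n.$$

Choose

$$\hat k = \mathrm{ArgMin}\{R_k : k \in \{0, \ldots, p\}\}$$

and

$$\tilde k = \mathrm{ArgMin}\{\tilde R_k : k \in \{0, \ldots, p\}\},$$

where ArgMin denotes the index which minimizes the quantities following the symbol ArgMin.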
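The selection rule can be sketched in a few lines. The sketch assumes the penalized form $R_k = Q_n(\hat\beta_{kn}) + kC_n$, which is what the differences $R_k - R_{k_0}$ computed in Section 2 imply; the function name and the numerical values are hypothetical.

```python
def select_order(q_values, c_n):
    """Return the k minimizing R_k = q_values[k] + k * c_n, where q_values[k]
    is the minimized criterion when only the first k coefficients are free."""
    penalized = [q + k * c_n for k, q in enumerate(q_values)]
    return min(range(len(penalized)), key=penalized.__getitem__)

# Hypothetical minimized criteria for k = 0, ..., 3 (nonincreasing in k).
q_values = [3.0, 1.05, 1.0, 0.98]
k_hat = select_order(q_values, c_n=0.1)
```

With these values the large drop from $k = 0$ to $k = 1$ outweighs the penalty, while the later small improvements do not. A much smaller penalty (for example `c_n=0.01`) would select the largest model, which is why the rate at which $C_n$ tends to zero matters.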

In this paper we shall consider the consistency of $\hat k$ and $\tilde k$ to the true model $k_0$.


2. Consistency of $\hat k$

In this section, we make the following general assumptions:

Assumption 1. The errors $e_i$, $i = 1, 2, \ldots$, are independent.

Assumption 2. $P\{e_n \notin [a_1, a_2]\} = 0$ and there is a positive constant $A$ such that for any $\varepsilon > 0$ and any $n$, we have

$$P\{e_n \in [a_1, a_1 + \varepsilon]\} \ge A\varepsilon$$

and

$$P\{e_n \in [a_2 - \varepsilon, a_2]\} \ge A\varepsilon.$$

Assumption 3. For any $\alpha > 0$, there exists a positive constant $C$ such that for any vector $a \neq 0$ it follows that

$$\#\{i \le n,\ |\ell(x_i) - \ell(a)| < \alpha\} \ge Cn$$

for large $n$; hereafter $\ell(a) = a/|a|$.

Assumption 4. There exists a positive constant $m$ such that

$$|x_i| \ge m, \quad \text{for } i = 1, 2, \ldots$$


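Now let us estimate $Q_n(\hat\beta_n)$. Define

$$E_n^{(1)} = \{i \le n,\ -x_i'(\hat\beta_n - \beta) \ge 0\},$$

$$E_n^{(2)} = \{i \le n,\ x_i'(\hat\beta_n - \beta) \ge 0\}.$$

Split $S = \{x \in R^p : |x| = 1\}$ into $d$ disjoint parts $\Sigma_1, \ldots, \Sigma_d$ such that $\forall\, x, y \in \Sigma_j$, $x'y \ge 3/4$. Let $y_j \in \Sigma_j$, $j = 1, \ldots, d$. Define $E_n^j = \{i \le n,\ \ell(x_i)'y_j \ge 3/4\}$, $j = 1, \ldots, d$. By Assumption 3, there exists $\delta_1 > 0$ such that

$$\#(E_n^j) \ge \delta_1 n, \quad j = 1, 2, \ldots, d.$$

It is easy to see that $-\ell(\hat\beta_n - \beta) \in \Sigma_j$ and $i \in E_n^j$ imply that

$$-x_i'\ell(\hat\beta_n - \beta) > 0, \quad \text{i.e. } i \in E_n^{(1)},$$

and that $\ell(\hat\beta_n - \beta) \in \Sigma_j$ and $i \in E_n^j$ imply that

$$x_i'\ell(\hat\beta_n - \beta) > 0, \quad \text{i.e. } i \in E_n^{(2)}.$$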


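Take $r_n$ satisfying

$$r_n \to 0 \quad \text{and} \quad nr_n/\log n \to \infty.$$

Then we have

$$
\begin{aligned}
P\big(Q_n(\hat\beta_n) \le a_2 - a_1 - 2r_n\big)
&\le P\Big(\max_{i \in E_n^{(1)}} e_i \le a_2 - r_n\Big) + P\Big(\min_{i \in E_n^{(2)}} e_i \ge a_1 + r_n\Big) \\
&\le \sum_{j=1}^{d} P\Big(\max_{i \in E_n^{(1)}} e_i \le a_2 - r_n,\ -\ell(\hat\beta_n - \beta) \in \Sigma_j\Big) \\
&\quad + \sum_{j=1}^{d} P\Big(\min_{i \in E_n^{(2)}} e_i \ge a_1 + r_n,\ \ell(\hat\beta_n - \beta) \in \Sigma_j\Big) \\
&\le \sum_{j=1}^{d} P\Big(\max_{i \in E_n^{j}} e_i \le a_2 - r_n,\ -\ell(\hat\beta_n - \beta) \in \Sigma_j\Big) \\
&\quad + \sum_{j=1}^{d} P\Big(\min_{i \in E_n^{j}} e_i \ge a_1 + r_n,\ \ell(\hat\beta_n - \beta) \in \Sigma_j\Big) \\
&\le \sum_{j=1}^{d} P\Big(\max_{i \in E_n^{j}} e_i \le a_2 - r_n\Big) + \sum_{j=1}^{d} P\Big(\min_{i \in E_n^{j}} e_i \ge a_1 + r_n\Big) \\
&\le 2d(1 - Ar_n)^{\delta_1 n} \le 2d\,e^{-A\delta_1 n r_n} \le 2d/n^2
\end{aligned}
$$

for large $n$. By the Borel-Cantelli Lemma we have

$$Q_n(\hat\beta_n) \ge a_2 - a_1 - 2r_n, \quad \text{a.s.}$$

when $n$ is large enough.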
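The last two inequalities in the chain, and the appeal to the Borel-Cantelli Lemma, rest on the growth condition imposed on $r_n$. The short derivation below spells this out; it uses only the elementary inequality $1 - x \le e^{-x}$ and is offered as a reading aid rather than as part of the original argument.

```latex
(1 - A r_n)^{\delta_1 n}
  \le \exp(-A \delta_1 n r_n)
  \le \exp(-2 \log n)
  = n^{-2}
\quad \text{as soon as} \quad
\frac{n r_n}{\log n} \ge \frac{2}{A \delta_1},

\text{which holds for all large } n \text{ because } n r_n / \log n \to \infty.
\text{ Since } \sum_{n} 2d/n^{2} < \infty,
\text{ the events } \{Q_n(\hat\beta_n) \le a_2 - a_1 - 2 r_n\}
\text{ occur only finitely often, a.s.}
```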


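Let $k_0$ be the index of the true model and let $\beta_0$ be the true parameter. Then obviously we have, for $p \ge k \ge k_0$,

$$Q_n(\hat\beta_n) = Q_n(\hat\beta_{pn}) \le Q_n(\hat\beta_{kn}) \le Q_n(\hat\beta_{k_0 n}) \le a_2 - a_1,$$

and hence, with probability one, for all large $n$,

$$0 \le Q_n(\hat\beta_{k_0 n}) - Q_n(\hat\beta_{kn}) \le 2r_n, \quad p \ge k \ge k_0.$$

If we take $C_n$ such that $C_n \to 0$, $C_n/r_n \to \infty$, then for $k > k_0$

$$R_k - R_{k_0} = (k - k_0)C_n + Q_n(\hat\beta_{kn}) - Q_n(\hat\beta_{k_0 n}) > 0, \tag{2}$$

for all large $n$.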


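Next, we consider the case of $k < k_0$. Denote

$$\eta = |\beta_{k_0}| > 0$$

and define

$$E_n^{(1)} = \{i \le n,\ |\ell(x_i) + \ell(\hat\beta_{kn} - \beta_0)| < 1/2\},$$

$$E_n^{(2)} = \{i \le n,\ |\ell(x_i) - \ell(\hat\beta_{kn} - \beta_0)| < 1/2\}.$$

Split $S$ into $b$ disjoint parts $\Pi_1, \ldots, \Pi_b$ such that $\forall\, x, y \in \Pi_j$, $|x - y| < 1/4$. Let $\xi_j \in \Pi_j$, $j = 1, \ldots, b$. Define

$$F_n^j = \{i \le n,\ |\ell(x_i) - \xi_j| < 1/4\}, \quad j = 1, \ldots, b.$$

By Assumption 3, there exists $\delta_2 > 0$ such that

$$\#(F_n^j) \ge \delta_2 n, \quad j = 1, 2, \ldots, b.$$

It is easy to see that

$$-\ell(\hat\beta_{kn} - \beta_0) \in \Pi_j \quad \text{and} \quad i \in F_n^j$$

imply that

$$|\ell(x_i) + \ell(\hat\beta_{kn} - \beta_0)| < 1/2, \quad \text{i.e. } i \in E_n^{(1)}.$$

Also,

$$\ell(\hat\beta_{kn} - \beta_0) \in \Pi_j \quad \text{and} \quad i \in F_n^j$$

imply that

$$|\ell(x_i) - \ell(\hat\beta_{kn} - \beta_0)| < 1/2, \quad \text{i.e. } i \in E_n^{(2)}.$$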


For $i \in E_n^{(1)}$, we have

$$x_i'(\hat\beta_{kn} - \beta_0) = |x_i|\,|\hat\beta_{kn} - \beta_0|\,\ell(x_i)'\ell(\hat\beta_{kn} - \beta_0) \le -m\eta/2.$$

Similarly, for $i \in E_n^{(2)}$, we have

$$x_i'(\hat\beta_{kn} - \beta_0) \ge m\eta/2.$$


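Hence

$$Q_n(\hat\beta_{kn}) \ge \max_{i \in E_n^{(1)}} e_i - \min_{i \in E_n^{(2)}} e_i + m\eta,$$

and

$$
\begin{aligned}
P\big(Q_n(\hat\beta_{kn}) \le a_2 - a_1 + m\eta/2\big)
&\le P\Big(\max_{i \in E_n^{(1)}} e_i \le a_2 - m\eta/4\Big) + P\Big(\min_{i \in E_n^{(2)}} e_i \ge a_1 + m\eta/4\Big) \\
&\le \sum_{j=1}^{b} P\Big(\max_{i \in E_n^{(1)}} e_i \le a_2 - m\eta/4,\ -\ell(\hat\beta_{kn} - \beta_0) \in \Pi_j\Big) \\
&\quad + \sum_{j=1}^{b} P\Big(\min_{i \in E_n^{(2)}} e_i \ge a_1 + m\eta/4,\ \ell(\hat\beta_{kn} - \beta_0) \in \Pi_j\Big) \\
&\le \sum_{j=1}^{b} P\Big(\max_{i \in F_n^{j}} e_i \le a_2 - m\eta/4,\ -\ell(\hat\beta_{kn} - \beta_0) \in \Pi_j\Big) \\
&\quad + \sum_{j=1}^{b} P\Big(\min_{i \in F_n^{j}} e_i \ge a_1 + m\eta/4,\ \ell(\hat\beta_{kn} - \beta_0) \in \Pi_j\Big) \\
&\le \sum_{j=1}^{b} P\Big(\max_{i \in F_n^{j}} e_i \le a_2 - m\eta/4\Big) + \sum_{j=1}^{b} P\Big(\min_{i \in F_n^{j}} e_i \ge a_1 + m\eta/4\Big) \\
&\le 2b(1 - Am\eta/4)^{\delta_2 n} \le 2b\,e^{-Am\eta\delta_2 n/4} \le 2b/n^2
\end{aligned}
$$

for large $n$. By the Borel-Cantelli Lemma, we have, with probability one,

$$Q_n(\hat\beta_{kn}) \ge a_2 - a_1 + m\eta/2, \quad \text{for all large } n.$$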


Thus for $k < k_0$, we have

$$
\begin{aligned}
R_k - R_{k_0} &= Q_n(\hat\beta_{kn}) - Q_n(\hat\beta_{k_0 n}) - (k_0 - k)C_n \\
&\ge m\eta/2 - (k_0 - k)C_n > 0,
\end{aligned}
\tag{3}
$$

for large $n$, since $C_n \to 0$.


(2) and (3) imply that $\hat k$ is strongly consistent. Summarizing the above arguments, we get the following theorem.


Theorem 1. Choose $C_n$ satisfying

(i) $C_n \to 0$,

(ii) $nC_n/\log n \to \infty$.

Suppose the four Assumptions given at the beginning of this section are true. Then $\hat k \to k_0$, a.s.


Proof. Use the arguments given before. We only need to note that for any sequence of $C_n$ satisfying (i) and (ii), we can always choose $r_n$ such that

(i)' $r_n/C_n \to 0$,

(ii)' $nr_n/\log n \to \infty$.

Q.E.D.
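One admissible choice of $r_n$ can be written down explicitly. The choice below is an illustration introduced here, not one given in the source; it takes $r_n$ to be the geometric mean of $C_n$ and $\log n / n$.

```latex
r_n = \sqrt{\frac{C_n \log n}{n}}
\quad \Longrightarrow \quad
\frac{r_n}{C_n} = \Big(\frac{n C_n}{\log n}\Big)^{-1/2} \to 0,
\qquad
\frac{n r_n}{\log n} = \Big(\frac{n C_n}{\log n}\Big)^{1/2} \to \infty .

% Example: C_n = n^{-1/2} satisfies (i) and (ii), since
% n C_n / \log n = n^{1/2} / \log n \to \infty.
```

Both limits follow directly from condition (ii), and $r_n = C_n \cdot (r_n/C_n) \to 0$ follows from (i).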


3. Consistency of $\tilde k$

In this section, we shall make the following general assumptions:

Assumption 1. The errors $e_i$, $i = 1, 2, \ldots$, are independent.

Assumption 2. $|a_1| \le a_2$; for all $n$, $P(e_n \notin [a_1, a_2]) = 0$; and there is a positive constant $A$ such that for any $\varepsilon > 0$ and for any $n$, we have

$$P(e_n \in [a_2 - \varepsilon, a_2]) \ge A\varepsilon.$$

Assumption 3. Same as Assumption 3 in Section 2.

Assumption 4. There exists a positive constant $m$ such that

$$|x_i| \ge m, \quad \text{for } i = 1, 2, \ldots$$


Now let us estimate $\tilde Q_n(\tilde\beta_n)$. Define

$$\cdots \le \sum_{j=1}^{d} P\Big(\max_{i \in E_n^{j}} e_i \le a_2 - r_n\Big) \le d(1 - Ar_n)^{\delta_1 n} \le d\,e^{-A\delta_1 n r_n} \le d/n^2$$

for large $n$. By the Borel-Cantelli Lemma we have

$$\tilde Q_n(\tilde\beta_n) \ge a_2 - r_n, \quad \text{a.s.}$$

when $n$ is large enough.


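Let $k_0$ be the index of the true model and let $\beta_0$ be the true parameter. Then obviously we have, for $p \ge k \ge k_0$,

$$\tilde Q_n(\tilde\beta_n) = \tilde Q_n(\tilde\beta_{pn}) \le \tilde Q_n(\tilde\beta_{kn}) \le \tilde Q_n(\tilde\beta_{k_0 n}) \le a_2,$$

$$0 \le \tilde Q_n(\tilde\beta_{k_0 n}) - \tilde Q_n(\tilde\beta_{kn}) \le r_n, \quad p \ge k \ge k_0.$$

If we take $C_n$ such that

$$C_n \to 0, \quad C_n/r_n \to \infty,$$

then for $k > k_0$

$$\tilde R_k - \tilde R_{k_0} = (k - k_0)C_n + \tilde Q_n(\tilde\beta_{kn}) - \tilde Q_n(\tilde\beta_{k_0 n}) > 0 \tag{4}$$

for all large $n$.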


Next, we consider the case of $k < k_0$. Denote

$$\eta = |\beta_{k_0}| > 0$$

and define

$$E_n = \{i \le n,\ \ell(x_i)'\ell(\tilde\beta_{kn} - \beta_0) \le -1/2\}.$$

Split $S$ into $b$ disjoint parts $\Pi_1, \ldots, \Pi_b$ such that $\forall\, x, y \in \Pi_j$, $x'y \ge 1/2$. Let $\xi_j \in \Pi_j$, $j = 1, \ldots, b$. Define $F_n^j$ as $F_n^j = \{i \le n,\ \ell(x_i)'\xi_j \ge 275/280\}$, $j = 1, \ldots, b$. By Assumption 3, there exists $\delta_2 > 0$ such that

$$\#(F_n^j) \ge \delta_2 n, \quad j = 1, \ldots, b.$$

It is easy to see that $-\ell(\tilde\beta_{kn} - \beta_0) \in \Pi_j$ and $i \in F_n^j$ imply that

$$\ell(x_i)'\ell(\tilde\beta_{kn} - \beta_0) \le -1/2, \quad \text{i.e. } i \in E_n.$$


For $i \in E_n$, we have

$$|x_i'(\tilde\beta_{kn} - \beta_0)| = |x_i|\,|\tilde\beta_{kn} - \beta_0|\,|\ell(x_i)'\ell(\tilde\beta_{kn} - \beta_0)| \ge m\eta/2.$$


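Hence

$$\tilde Q_n(\tilde\beta_{kn}) \ge \max_{i \in E_n} e_i + m\eta/2,$$

and

$$
\begin{aligned}
P\big(\tilde Q_n(\tilde\beta_{kn}) \le a_2 + m\eta/4\big)
&\le P\Big(\max_{i \in E_n} e_i \le a_2 - m\eta/4\Big) \\
&\le \sum_{j=1}^{b} P\Big(\max_{i \in E_n} e_i \le a_2 - m\eta/4,\ -\ell(\tilde\beta_{kn} - \beta_0) \in \Pi_j\Big) \\
&\le \sum_{j=1}^{b} P\Big(\max_{i \in F_n^{j}} e_i \le a_2 - m\eta/4,\ -\ell(\tilde\beta_{kn} - \beta_0) \in \Pi_j\Big) \\
&\le \sum_{j=1}^{b} P\Big(\max_{i \in F_n^{j}} e_i \le a_2 - m\eta/4\Big) \\
&\le b(1 - Am\eta/4)^{\delta_2 n} \le b/n^2
\end{aligned}
$$

for large $n$. By the Borel-Cantelli Lemma, we have, with probability one, when $n$ is large enough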


4.  General  Case 


In this section we consider the same regression model (1). But the problem we are going to solve is to determine the subset (or the model) $J = \{j_1, \ldots, j_k\}$, $1 \le j_1 < \cdots < j_k \le p$, such that $\beta_j \neq 0$ if and only if $j \in J$. We make the same assumptions as given in previous sections.

Of course, we can use the procedure described in Sections 2 and 3 to determine the model $J$ as follows. For each permutation $\pi$ of $\beta = (\beta_1, \ldots, \beta_p)'$, similarly rearranging $(x_{i1}, \ldots, x_{ip})'$, we get a new model $M_\pi$. Under this model, using the approach given in Sections 2 and 3, we obtain estimates $\hat k_\pi$ and $\tilde k_\pi$. Let $\hat k = \hat k_{\hat\pi} = \min_\pi \hat k_\pi$ and $\tilde k = \tilde k_{\tilde\pi} = \min_\pi \tilde k_\pi$, and let $\hat J_1 = \{\hat\pi(1), \ldots, \hat\pi(\hat k)\}$ and $\tilde J_1 = \{\tilde\pi(1), \ldots, \tilde\pi(\tilde k)\}$. We can easily prove, by using Theorems 1 and 2, that $\hat J_1 \to J$, a.s., and $\tilde J_1 \to J$, a.s.


An alternative method to estimate $J$ is given as follows. Suppose $T$ is a subset of $\{1, \ldots, p\}$. Consider the model $T$:

$$y_i = x_i(T)'\beta(T) + e_i,$$

where $x_i(T) = (x_{ij},\ j \in T)'$ and $\beta(T) = (\beta_j,\ j \in T)'$. Let

$$Q_n(T) = \min_{\beta(T)} \Big\{ \max_{1 \le i \le n}\big(y_i - x_i(T)'\beta(T)\big) - \min_{1 \le i \le n}\big(y_i - x_i(T)'\beta(T)\big) \Big\}$$

and

$$\tilde Q_n(T) = \min_{\beta(T)} \max_{1 \le i \le n} |y_i - x_i(T)'\beta(T)|.$$

Define

$$R_T = Q_n(T) + \#(T)C_n$$

and

$$\tilde R_T = \tilde Q_n(T) + \#(T)C_n.$$

Choose $\hat J_2$ such that

$$R_{\hat J_2} = \min_T R_T$$

and choose $\tilde J_2$ such that

$$\tilde R_{\tilde J_2} = \min_T \tilde R_T.$$

We can also prove that $\hat J_2 \to J$, a.s., and $\tilde J_2 \to J$, a.s. However, there would be too much computation involved when $p$ is relatively large. In the first case, there are in total $p!$ permutations, whereas in the second there are $2^p$ subsets of $\{1, \ldots, p\}$. In light of this, we propose another approach to estimate $J$ which only involves $p + 1$ quantities to be computed.
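The difference in computational burden is easy to quantify. The snippet below is a small illustration with a hypothetical helper name; it simply evaluates the three counts mentioned in the text for a given $p$.

```python
from math import factorial

def candidate_counts(p):
    """Number of candidates each procedure has to examine for p regressors."""
    return {
        "permutations": factorial(p),  # first method: one ordering per permutation
        "subsets": 2 ** p,             # second method: one model per subset T
        "proposed": p + 1,             # proposed method: p + 1 quantities
    }

counts = candidate_counts(10)
```

Already for $p = 10$ the first two procedures examine 3628800 orderings and 1024 subsets respectively, against 11 quantities for the proposed approach.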

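Now let

$$\beta(j) = (\beta_1, \ldots, \beta_{j-1}, 0, \beta_{j+1}, \ldots, \beta_p)'$$

and define

$$Q_n(j) = \min_{\beta(j)} \Big\{ \max_{1 \le i \le n}\big(y_i - x_i'\beta(j)\big) - \min_{1 \le i \le n}\big(y_i - x_i'\beta(j)\big) \Big\}$$

and, analogously,

$$\tilde Q_n(j) = \min_{\beta(j)} \max_{1 \le i \le n} |y_i - x_i'\beta(j)|.$$

Write

$$R(n,j) = Q_n(j) - Q_p - C_n,$$

$$\tilde R(n,j) = \tilde Q_n(j) - \tilde Q_p - C_n,$$

where $Q_p$ and $\tilde Q_p$ denote the minimum values of $Q_n(\beta)$ and $\tilde Q_n(\beta)$ over the full $p$-parameter model. We choose

$$\hat J_n = \{\hat j_1, \ldots, \hat j_{\hat k}\} = \{j : R(n,j) > 0\},$$

$$\tilde J_n = \{\tilde j_1, \ldots, \tilde j_{\tilde k}\} = \{j : \tilde R(n,j) > 0\}.$$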
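The rule keeps coefficient $j$ exactly when forcing it to zero worsens the minimized criterion by more than the penalty $C_n$. A minimal sketch follows, assuming the minimized criteria have already been computed by some solver; the function name and the numbers are hypothetical.

```python
def select_subset(q_drop, q_full, c_n):
    """Return {j : R(n, j) > 0} with R(n, j) = q_drop[j] - q_full - c_n.

    q_drop[j] is the minimized criterion when coefficient j is forced to 0;
    q_full is the minimized criterion of the unrestricted p-parameter model.
    """
    return {j for j, q in q_drop.items() if q - q_full - c_n > 0}

# Hypothetical values for p = 4: dropping coefficient 1 or 3 hurts the fit,
# dropping coefficient 2 or 4 barely changes it.
q_drop = {1: 2.4, 2: 1.02, 3: 1.9, 4: 1.01}
j_hat = select_subset(q_drop, q_full=1.0, c_n=0.1)
```

Only $p$ restricted fits and one full fit are needed, which matches the count of $p + 1$ quantities stated above.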


Then we have the following theorems.

Theorem 3. Under the conditions of Theorem 1, we have that

$$\hat J_n \to J, \quad \text{a.s.},$$

where model $J = \{j_1, \ldots, j_k\}$ is the true one.

Proof. If $j \in J$, by (3) with the replacement that $k_0 = p$ and $k = p - 1$, we have that, with probability one, $R(n,j) > 0$ for all large $n$, i.e. $j \in \hat J_n$. Hence, when $n$ is large enough, $J \subset \hat J_n$. Conversely, if $j \notin J$, using the same argument as in proving Theorem 1, we have

$$R(n,j) = Q_n(j) - Q_p - C_n \le O(\log n/n) - C_n, \quad \text{a.s.},$$

which together with (ii) implies that

$$R(n,j) < 0, \quad \text{for large } n,$$

i.e. $j \notin \hat J_n$ when $n$ is large enough. Therefore $\hat J_n \subset J$, which completes the proof of Theorem 3.


Theorem 4. Under the conditions of Theorem 2, we have that

$$\tilde J_n \to J, \quad \text{a.s.},$$

where model $J = \{j_1, \ldots, j_k\}$ is the true one.

Proof. If $j \in J$, by (5) with the replacement that $k_0 = p$ and $k = p - 1$, we have that, with probability one, $\tilde R(n,j) > 0$ for all large $n$, i.e. $j \in \tilde J_n$. Hence, when $n$ is large enough, $J \subset \tilde J_n$. Conversely, if $j \notin J$, using the same argument as in proving Theorem 2, we have

$$\tilde R(n,j) = \tilde Q_n(j) - \tilde Q_p - C_n \le O(\log n/n) - C_n, \quad \text{a.s.},$$

which together with (ii) implies that